Abstract. The object of this research was to develop and evaluate a formal 
measure of shape similarity that could predict human performance in recognizing 
and discriminating highly similar complex shapes and objects. The similarity 
measure is the correlation in activation values over a lattice of columns of Gabor 
filters (termed “Gabor Jets”). Each column is composed of a number of filters at 
different scales and orientations, all centered on the same position in the visual 
field, similar to cortical hypercolumns (Lades et al., 1993). For amoeboid blobs, 
infrared images of tanks, and faces, high correlations were obtained between the 
similarity measure and psychophysical discrimination/identification performance. 

To our knowledge, this is the first time that the similarity of complex shapes has 
been determined directly from a measure of the shapes themselves. A different 
similarity basis is required, however, when distinctive nonaccidental properties 
(NAPs) are available. We present rigorous evidence for the employment of NAPs in 
object recognition (Biederman & Bar, 1998) and the preferential tuning of cells in 
the inferotemporal cortex to NAPs (Vogels, Biederman, Bar, & Lorincz, 2000). 

These findings have provided the bases for an integrated account of basic- and 
subordinate-level object classification combining geon theory with the Gabor Jet 
model (Biederman & Kalocsai, 1997). 

Overview of Major Results 

The object of this research was to develop and evaluate a formal measure of shape similarity 
that could predict human performance in recognizing and discriminating highly similar complex 
shapes and objects. Remarkably, we succeeded. The similarity measure is the correlation in 
activation values over a lattice of columns of Gabor filters (termed “Gabor Jets”). Each column 
is composed of a number of filters at different scales and orientations, all centered on the same 
position in the visual field. Such a scheme is presumed to be a simplified model of early cortical 
filtering, where the hypercolumns are arranged in a lattice superimposed over the image (Lades 
et al., 1993). In a sequential same-different task of a subset of the more similar pairs of the 
Shepard & Cermak (1973) toroidal "free-form" blobby shapes, the similarity measure correlated 
in the mid-.90s with the RTs and error rates in judging that a pair of shapes were different, 
without estimating any free parameters (Biederman & Subramaniam, 1997). To our knowledge, 
this is the first time that the similarity of complex shapes has been determined directly from a 
measure of the shapes themselves. High correlations were also noted (Biederman, Kalocsai, & 
O'Kane, 1998) between the measure of shape and psychophysical measures of face similarity 
(Biederman & Kalocsai, 1997) and in the confusion matrix when identifying infrared images of 
military vehicles (O’Kane, Biederman, Cooper, & Nystrom, 1997). The Gabor Jet model enjoys 
its greatest success for difficult discriminations when distinctive nonaccidental properties are not 
available. The ability of the Gabor Jet model to predict discrimination performance is greatly 
reduced when objects are viewed at different orientations in depth and/or nonaccidental (i.e., 
viewpoint invariant) properties are available. Several studies have provided rigorous evidence 
for the employment of NAPs in object recognition (Biederman & Bar, 1998) and the preferential 
tuning of cells in the inferotemporal cortex to NAPs (Vogels, Biederman, Bar, & Lorincz, 
2000). These findings have provided the bases for a theory that integrates geon theory with the 
Gabor Jet model to account for basic- and subordinate-level object classification 
(Biederman & Kalocsai, 1997; Biederman, et al., 2000). 

We have investigated whether a computer vision system for face recognition developed 
by von der Malsburg and his associates (Lades et al., 1993; Wiskott et al., 1997; see Fiser, 
Biederman, & Cooper, 1996, for a more extended overview) could model human performance 
when making difficult shape discriminations. We were particularly interested in this system 
because it was designed to reflect the early cortical processing of images and it was successful at 
actually doing face recognition. A version of this system has recently won a DARPA national 
competition for face recognition systems, with a level of performance that would have been 
undreamed of just a few years ago: In a gallery of several thousand faces, the system was 95% 
correct in matching a probe face to a target that could differ moderately in pose and expression. 
The question was whether it would provide a basis for representing the psychophysical 
similarity of complex shapes, such as faces, infrared images of military vehicles, and unfamiliar 
blobs in shape discrimination and recognition tasks. We also explored the conditions under 
which it would not predict psychophysical performance. 


The Gabor Jet Model. The model (Figure 1) first filters the image with columns 
of Gabor filters, each column termed a Gabor "jet." Each jet contains filters at, say, 5 scales and 8 
orientations, for a total of 40 kernels (in this example). (Each kernel actually comes as a sine and 
cosine pair, but we will ignore that aspect in this brief overview.) Each jet is centered on a position in 
the visual field at a vertex of a lattice of, say, 6 rows X 8 columns that covers the object. 
[There is automatic normalization for position, size, and overall contrast.] The lower frequency 
kernels cover much of the object and thus are highly sensitive to the configuration of features. 
In neural terms, each jet would correspond to a hypercolumn at the initial stage of spatial 
processing in the visual cortex. The activation of each cell (= kernel) is a 
function of the tuning (i.e., receptive field) of that cell to the orientation and scale of the contrast 
in that portion of the visual field. 
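
As a rough illustration of the filtering stage just described, the following Python sketch computes one jet (5 scales x 8 orientations = 40 kernel magnitudes) at each vertex of a 6 x 8 lattice. It is not the Lades et al. (1993) implementation. The kernel size, the wavelengths and their half-octave spacing, the Gaussian width, the mean subtraction used to suppress the response to uniform regions, and the function names (gabor_kernel, jet_at, lattice_jets) are all assumptions made for illustration, and the normalization for position, size, and contrast is omitted.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: real part is the cosine phase, imaginary part the sine phase."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the direction of the sinusoidal carrier.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.exp(1j * 2.0 * np.pi * xr / wavelength)
    # Assumed simplification: subtract the mean so uniform regions give no response.
    return kernel - kernel.mean()


def jet_at(image, row, col, n_scales=5, n_orients=8, size=33):
    """Magnitudes of the n_scales * n_orients kernels centered on pixel (row, col)."""
    half = size // 2
    padded = np.pad(image.astype(float), half, mode="constant")
    patch = padded[row:row + size, col:col + size]  # window centered on (row, col)
    jet = []
    for s in range(n_scales):
        wavelength = 4.0 * 2.0 ** (s / 2.0)  # assumed half-octave spacing, in pixels
        sigma = 0.5 * wavelength             # assumed envelope width
        for o in range(n_orients):
            theta = np.pi * o / n_orients
            kernel = gabor_kernel(size, wavelength, theta, sigma)
            # Magnitude combines the sine and cosine responses into one amplitude.
            jet.append(np.abs(np.sum(patch * kernel)))
    return np.array(jet)


def lattice_jets(image, n_rows=6, n_cols=8):
    """Jets at the vertices of an n_rows x n_cols lattice spanning the image."""
    rows = np.linspace(0, image.shape[0] - 1, n_rows).round().astype(int)
    cols = np.linspace(0, image.shape[1] - 1, n_cols).round().astype(int)
    return np.array([[jet_at(image, r, c) for c in cols] for r in rows])
```

Calling lattice_jets on a grayscale image array returns an array of shape (6, 8, 40), one 40-element amplitude vector per lattice vertex.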


[Figure 1 panels. Filters: Gabor wavelets (8 orientations, 5 scales). Feature (Jet): set of 8x5 filter responses at one image location. Model (Graph): grid of 4x6 connected jets.] 

Figure 1. Illustration of the input layer to the Lades et al. (1993) network. The basic kernels are Gabor filters at 
different scales and orientations, two of which are shown on the left. The center figure illustrates the composition of 
a jet, with the larger disks representing lower spatial frequencies. The number of jets, scales, and orientation can be 
varied. (From Biederman & Kalocsai, 1997.) 

A similarity measure is derived by correlating, for each jet in the image, the amplitudes 
of the 40 kernels of that jet with the corresponding kernels of the corresponding jet in a stored 
image. This correlation can be expressed as the cosine between two 40-element vectors. Given 
a set of stored images, the one with the highest correlation to the probe image, if it exceeds some 
minimum value, would be taken to be the best match. An option is to allow simulated annealing, 
in which the jets can undergo modest variation in their position, though at some cost, to 
maximize the correlation. The final positions for nonidentical images would result in a 
deformation of the original lattice, as shown in Fig. 7 for tanks and Fig. 9 for faces. Although 
such a stage results in higher absolute similarity values, the ordering of the similarity of the 
images is only rarely affected by the annealing. Thus, if image A is more similar to image B 
than it is to C before annealing, almost always the same relation will hold after annealing. I 
have included the annealing stage in Figures 7 and 9 to illustrate the best fitting distorted meshes. 
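
The matching step can be sketched as follows, without the optional annealing stage. The cosine between corresponding jets follows the description above; averaging the per-jet cosines into a single image similarity is an assumption of this sketch, as are the function names and the dictionary used to hold the stored images.

```python
import numpy as np


def jet_similarity(jet_a, jet_b):
    """Cosine between two jets (vectors of kernel amplitudes)."""
    a = np.asarray(jet_a, dtype=float)
    b = np.asarray(jet_b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def image_similarity(jets_a, jets_b):
    """Mean cosine over corresponding jets (averaging is an assumption of this sketch)."""
    a = np.asarray(jets_a, dtype=float)
    b = np.asarray(jets_b, dtype=float)
    # Flatten any lattice layout to (n_jets, n_kernels) so jets pair up by position.
    a = a.reshape(-1, a.shape[-1])
    b = b.reshape(-1, b.shape[-1])
    return float(np.mean([jet_similarity(x, y) for x, y in zip(a, b)]))


def best_match(probe_jets, gallery, min_similarity=0.0):
    """Return (name, similarity) of the most similar stored image.

    gallery is a hypothetical dict mapping image names to jet arrays.
    The name is None when the best similarity does not exceed min_similarity.
    """
    scores = {name: image_similarity(probe_jets, jets) for name, jets in gallery.items()}
    name = max(scores, key=scores.get)
    sim = scores[name]
    return (name, sim) if sim > min_similarity else (None, sim)
```

Because the cosine ignores overall vector length, a uniform change in the amplitude of all kernels in a jet leaves its similarity value unchanged.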

PREDICTING THE PSYCHOPHYSICAL SIMILARITY OF NOVEL COMPLEX SHAPES 

1. Discriminating the Shepard and Cermak (1973) shapes. 

We (Biederman & Subramaniam, 1997) have demonstrated that the Gabor Jet model can 
predict, virtually perfectly, the speed and accuracy with which subjects discriminate (as Same or 
Different) a sequentially presented pair of asymmetrical, unfamiliar, blobby shapes, devised by 
Shepard and Cermak (1973) and shown in Fig. 2. The Gabor Jet scaling of the distances of the 
shapes is illustrated in Fig. 3. We used the task illustrated in Fig. 4. (A similar sequential 
same-different task was used to assess the degree to which similarity values derived from the Gabor Jet 
model would predict psychophysical similarity of faces.) 

As illustrated in Figure 4, subjects had to judge whether a pair of highly similar shapes 
were same or different. For similar shapes (those pairs with a similarity value of .82 and higher) 
the correlation between model similarity values and RTs and error rates on Different trials was 
.96 and .95, respectively, as shown in Fig. 5. These are extraordinarily high values given that no 
free parameters were fitted to the performance of the subjects. In fact, it is the first 
time that performance in discriminating complex shapes was predicted quantitatively from a 
theoretical model. 
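
The correlations reported here are ordinary Pearson correlations between the model's similarity value for each pair and the mean RT (or error rate) for that pair, computed over the pairs at or above the .82 cutoff. A minimal sketch of that computation follows; the function names are hypothetical and no data from the experiment are reproduced.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def restricted_range_r(similarity, performance, min_similarity=0.82):
    """Correlation computed only over pairs whose similarity is at least min_similarity."""
    similarity = np.asarray(similarity, dtype=float)
    performance = np.asarray(performance, dtype=float)
    keep = similarity >= min_similarity
    return pearson_r(similarity[keep], performance[keep])
```

Nothing in this computation is fitted to the subjects' data: the similarity values come from the images alone, and the correlation simply summarizes how well they order the observed performance.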

When we investigated the full range of similarities, from .68 to .95, a bilinear relationship 
was evident, as is shown in Fig. 6. Why? Consider, in Fig. 2, the difference in the left lobes 
between H1 and F4. H1’s is squarish; F4’s is pointy. That difference, a difference in 
nonaccidental properties (NAPs), is typical of those pairs of shapes whose similarity is 
lower than .82. (NAPs are properties of images that do not change with rotation in depth, 
such as whether a line is straight or curved.) Shepard and Cermak (1973) had their subjects 
provide interpretations of the shapes in Fig. 2. Very close shapes tended to differ only 
metrically, e.g., in the degree of bulge of a lobe. Such shapes tended to have the same 
interpretation, e.g., as a bird looking left; a stone axe; a cat looking right; Africa. Shapes at a 
greater distance, i.e., lower Gabor Jet similarity, tended to have different interpretations. By our 
scaling, the average Gabor Jet similarity at which the interpretation would change was .82. The 
implication of this result is that a change in a NAP invites a change in a visual concept. 
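
One way to look for such a bilinear pattern is to fit two separate least-squares lines on either side of a candidate breakpoint and keep the breakpoint with the smallest total squared error. The sketch below does this. It is an illustrative procedure only, not necessarily the one used to draw the lines in Fig. 6, and the function name and the minimum of three points per segment are assumptions.

```python
import numpy as np


def two_segment_fit(x, y, min_points=3):
    """Fit two independent lines split at the breakpoint that minimizes total squared error.

    Returns (breakpoint, (slope_low, intercept_low), (slope_high, intercept_high)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = None
    for i in range(min_points, len(x) - min_points + 1):
        sse = 0.0
        fits = []
        for xs, ys in ((x[:i], y[:i]), (x[i:], y[i:])):
            slope, intercept = np.polyfit(xs, ys, 1)  # degree-1 least-squares line
            sse += float(np.sum((ys - (slope * xs + intercept)) ** 2))
            fits.append((float(slope), float(intercept)))
        if best is None or sse < best[0]:
            # Report the breakpoint midway between the two neighboring x values.
            best = (sse, 0.5 * (x[i - 1] + x[i]), fits[0], fits[1])
    return best[1], best[2], best[3]
```

With similarity on the x axis and RT or error rate on the y axis, a breakpoint near .82 with a shallow low-similarity segment and a steep high-similarity segment would correspond to the pattern described above.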
Figure 2. The complete set of stimuli used in the experiments. The bottom row (J) is identical to the top 
row (A) and the rightmost column (10) is identical to the leftmost column (1), implying that the array is topologically 
equivalent to the two-dimensional surface of a torus or a doughnut. (Adapted from Shepard & Cermak, 1973.) 



Figure 3. Illustration of four of the 81 Shepard and Cermak (1973) free-form shapes. The similarity values from 
the Gabor Jet model (Lades et al., 1993) for each of the shapes with respect to A are shown in the shapes. 
[Figure 4 panel title: Example of a "Different" Trial] 



Figure 4. Illustration of a single (“Different”) trial from Subramaniam et al. (2000) for a pair of shapes with a 
similarity value of .89. The experiment assessed the correlation between similarity values derived from the Gabor 
Jet model for a set of free-form shapes and human performance in discriminating those shapes. In the actual 
experiment, S2 was displaced slightly from S1, even on Same trials, so that subjects could not use a displacement cue 
to make their judgment that the shapes were different. 
[Figure 5 panel titles: Lades et al. (1993) Model Similarities vs. RTs; Lades et al. (1993) Model Similarities vs. Errors, Experiment II] 



Figure 5. RTs and error rates plotted as a function of the Lades et al. (1993) wavelet similarity measure over the 
restricted range. The best fitting lines for the data are also shown. Correlations were 0.96 and 0.95 for RTs and 
error rates, respectively. 
Figure 6. RTs and error rates plotted as a function of the Lades et al. (1993) wavelet similarity measure. The best 
fitting lines highlighting the bilinear structure of the data are also shown. Correlations were 0.87 and 0.80 for RTs 
and error rates, respectively. 

2. Identifying Infrared Images of Military Vehicles 

O'Kane, Biederman, Cooper, & Nystrom (1997) studied the confusion errors that trained 
military observers, e.g., tank crews, made in attempting to identify infrared images of 15 military 
vehicles, such as tanks, APCs, jeeps, and trucks. A similarity tree (not shown) was generated 
based on judged differences in the parts and the distinctiveness of the differences, with 
distinctiveness defined in terms of the scale and nonaccidentalness of the difference. That is, 
large and/or nonaccidental differences in shape occupied a higher position in the tree (and therefore 
were at greater nodal distances) than small and/or metric differences. The fewer the number of 
nodes separating two vehicles, the greater the number of confusions. Although such a tree 
provides an excellent means of instruction and was strongly correlated with confusion errors, it 
was also possible to account for the confusion errors by scaling the differences according to the 
Gabor Jet model. Figure 7 provides some sample images from the experiment and their 
confusion rates with a T62 tank. The similarity values were obtained from close-up, low-noise 
images, as shown in Fig. 7, but the data came from an identification task performed on 
images taken from much greater distances and at very high noise levels, as shown in Fig. 8 
(O’Kane, Biederman, & Cooper, 1997). The Gabor Jet similarity measure correlated .81 with 
the confusion matrix. This result suggests that the model preserves essential characteristics of 
shape under degraded viewing conditions. This value, .81, is uncorrected for the unreliability in 
the confusion matrix itself. That is, if we were to replicate the experiment, we would not find 
exactly the same values in the confusion matrix; they would differ somewhat from the original 
values. This variability is error variance and cannot be predicted. Consequently, the value of r = 
.81, which accounts for about 66% of the variance (r² = proportion of variance predicted), is a 
lower-bound estimate of the proportion of the total variance (= predictable variance + error 
variance) that is predictable by the Gabor Jet measure. 
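
The arithmetic behind this argument can be made explicit. The sketch below squares the correlation to obtain the proportion of total variance accounted for and then applies the standard correction for unreliability in the criterion: if a proportion `reliability` of the variance in the confusion matrix would replicate, the share of that predictable variance accounted for is r² divided by the reliability. The report gives no reliability estimate, so the `reliability` argument is hypothetical, and the function names are illustrative only.

```python
def variance_explained(r):
    """Proportion of total variance accounted for by a correlation r."""
    return r * r


def share_of_predictable_variance(r, reliability):
    """r**2 divided by the criterion reliability (the replicable share of total variance).

    Assumes the error lies only in the criterion (here, the confusion matrix)
    and that reliability is expressed as a proportion in (0, 1].
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return (r * r) / reliability
```

For r = .81, variance_explained gives .6561. Any reliability below 1 raises the share of predictable variance above that figure, which is why the uncorrected value is a lower bound.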

3. Matching Faces without Easy Features 

Biederman and Kalocsai (1997) investigated same-different judgments of faces that were 
either at different orientations, different expressions, or both. The faces on a given trial were 
of the same sex, age, and race, with the hairline concealed, and had no easy distinguishing 
features such as facial hair, glasses, etc., as shown in Fig. 9. In a task similar to that shown in 
Fig. 4, subjects judged whether the pair of images were the same individual or not. Different 
orientations and expressions reduced the similarity of the images. Pairs of images were scaled 
according to the Gabor Jet model. For Same trials, RTs and error rates were highly 
negatively correlated (r of -.90 or stronger) with the Gabor Jet similarity values. 

[Figure 7 panel label: M1, Mean Confusion = 8.6%] 

Figure 7. The confusion rates for three tanks with the T62. The original reference image is the T62 showing the 
regular rectangular mesh (7 X 14 in this case) with the jets positioned at each of the nodes according to the Gabor 
Jet model (Lades et al., 1993). The best fitting jet positions for the comparison tanks produce a distorted mesh, with 
the magnitude of the distortion reflecting the degree of dissimilarity. An identical image matched against itself 
would have a similarity value of 1. The values for the T55, M551, and M1 were .85, .79, and .80, respectively. In 
this case, the model did not reflect the relatively high confusion rate of the M551 with the T62 compared to the M1. 
In general, however, the correlation between model similarity value and confusion rate was quite high (.81). From 
O’Kane, B. L., Biederman, I., Cooper, E. E., & Kalocsai, P. (1996). 

Figure 8. Sample images at near (left column) and far (right column) ranges and at low (top row) and high 
(bottom row) noise levels used in the O'Kane, B. L., Biederman, I., Cooper, E. E., & Kalocsai, P. (1996) experiments. 

Figure 9. The positioning of the Lades et al. (1993) lattice over an original image is shown in the left-hand column 
(a). The result of the diffusion over a pair of faces is shown in the right column (b). From top to bottom, the rows 
illustrate a change in expression, orientation, and both expression and orientation. In general, the more distorted the 
grid, the more dissimilar the images of the two faces. In a task where subjects have to judge whether two face 
images are of the same person, the Gabor Jet similarity values are well correlated with the RTs and error rates in 
judging that two images are of the same person: The greater the similarity, the easier it is to judge that two images 
of the same individual are the same person and the more difficult it is to judge that images of different people are, in 
fact, different. (From Kalocsai & Biederman, 1996.) 

IV. Comparing NAPs and MPs in Object Recognition 

Bar, M., & Biederman, I. (1999). One-shot viewpoint invariance in matching novel objects. 
Vision Research, 39, 2885-2899. 

Humans show an extraordinary competence at recognizing objects from arbitrary orientations 
in depth. According to one class of theories, this competence is based on previously having 
learned different templates, expressing the metric properties (MPs) at the different orientations. 
An alternative class of theories assumes that nonaccidental properties (NAPs) can be exploited 
so that even novel objects can be recognized under depth rotation. Same-Different judgments of 
a sequential pair of novel rotated objects, differing in a MP or a NAP (when different) (Fig. 10), 
viewed only once by each subject, revealed complete depth invariance when objects differed in a 
NAP. The sequence of events on a trial is illustrated in Fig. 11. Following a press of the 
mouse button, a fixation dot appeared for 500 ms, followed by a 400 ms presentation of the 
object, which was then immediately followed by a mask consisting of a combination of different 
gray-level objects presented for 500 ms. A second object image was then presented for 300 ms, 
followed by a second 500 ms mask. The second stimulus was translated randomly over nine 
possible positions on the screen, specified by a three by three matrix with adjacent horizontal or 
vertical centers separated by 6.8°. Thus, the second image could be above or below, and/or, to 
the right or to the left of the first image which was always centered. Rotation dramatically 
reduced the detectability of MP differences to a level well below that expected by chance (Fig. 12). 
NAPs offer a striking advantage over MPs for object classification and are therefore more likely 
to play a central role in the representation of objects. 
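
The trial structure described above can be summarized as a short timeline. The sketch below encodes the stated durations and the 3 x 3 grid of positions for the second object (6.8° between adjacent centers). The absence of gaps between events, uniform sampling over the nine positions, and all names are assumptions of this sketch rather than details taken from the published method.

```python
import random

# (event name, duration in ms), in presentation order, as stated in the text.
TRIAL_EVENTS = [
    ("fixation", 500),
    ("object_1", 400),
    ("mask_1", 500),
    ("object_2", 300),
    ("mask_2", 500),
]

SPACING_DEG = 6.8  # separation of adjacent horizontal or vertical centers


def event_onsets(events=TRIAL_EVENTS):
    """Return ([(name, onset_ms), ...], total_ms), assuming no gaps between events."""
    onsets = []
    t = 0
    for name, duration in events:
        onsets.append((name, t))
        t += duration
    return onsets, t


def second_object_offset(rng=random):
    """Offset (x_deg, y_deg) of the second object relative to the centered first object.

    Samples one of the nine cells of the 3 x 3 grid; uniform sampling is assumed.
    """
    col = rng.choice((-1, 0, 1))
    row = rng.choice((-1, 0, 1))
    return (col * SPACING_DEG, row * SPACING_DEG)
```

Under the no-gap assumption, the second object appears 1,400 ms after fixation onset and the trial's display sequence ends at 2,200 ms.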

Vogels, R., Biederman, I., Bar, M., & Lorincz, A. (2000). Inferior temporal neurons show 
greater sensitivity to nonaccidental than metric differences. Journal of Cognitive 
Neuroscience, in press. 

It has long been known that macaque inferior temporal (IT) neurons tend to fire more 
strongly to some shapes than to others, and that different IT neurons can show markedly different 
shape preferences. Beyond the discovery that these preferences can be elicited by features of 
moderate complexity, no general principle of (non-face) object recognition had emerged by 
which this enormous variation in selectivity could be understood. Psychophysical as well as 
computational work suggests that one such principle is the difference between viewpoint- 
invariant, non-accidental (NAP) and view-dependent, metric shape properties (MPs). Vogels et 
al. (2000) measured the responses of single IT neurons to objects differing in either a NAP 
(namely, a change in a geon) or a MP of a single part, shown at two orientations in depth. The 
images were those from Biederman and Bar (1999). The cells were more sensitive to changes in 
NAPs than in MPs, even though the image variation (as assessed by wavelet-like measures) 
produced by the former was smaller than that produced by the latter. The magnitude of the response modulation 
from the rotation itself was, on average, similar to that produced by the NAP differences, 
although the image changes from the rotation were much greater than those produced by NAP 
differences. Multidimensional scaling of the neural responses indicated a NAP/MP dimension, 
independent of an orientation dimension. The present results thus demonstrate that a significant 
portion of the neural code of IT cells represents differences in NAPs rather than MPs. This code 
may enable immediate recognition of novel objects at new views. 




FIG. 10 An example of possible variations from the original version in the calibration and rotation phases. The VIP 
(viewpoint-invariant property, i.e., NAP) change is one of a curved to a straight cylinder. The MP change is a change 
in the degree of curvature of the cylinder. 
In the calibration phase (0°) the objects are depicted from identical orientations, with the magnitude of the VIP and 
MP changes selected to yield equal detectability, as shown at the 0° value in Fig. 12. In the rotation phase, objects 
were rotated (an average of) 57°. The difference in surface lightness between the two orientations is a consequence 
of a single light source used in the rendering (which provided a potential cue as to the degree of rotation). 

[Figure 11 panel label: Mask, 500 msec] 



FIG. 11 Sequence of events on an experimental trial. An illustration of a VIP DIFFERENT trial in the rotation 
phase of the experiment. The sequence would be the same in the calibration phase. Note the shift of the second 
object to the lower right, relative to the position of the first object. 

FIG. 12. Mean error rates (upper panel) and mean correct reaction times (lower panel) for 10 subjects as a function 
of differences in orientation and the type of difference (Same, MP Different, and VIP Different). RTs greater than 
2,000 msec were counted as errors. Error bars are the S.E.s when variance attributable to main effects of subjects 
and objects are removed. 

recognition. Pp. 3-25. In H. Wechsler, P. J. Phillips, V. Bruce, F. F. Soulie, & T. Huang 
(Eds.), Face Recognition: From Theory to Applications. NATO ASI Series F, Springer- 
Verlag. 

Biederman, I., & Kalocsai, P. (1997). Neurocomputational bases of object and face recognition. 
Philosophical Transactions of the Royal Society London: Biological Sciences, 352, 1203- 
1219. 

Biederman, I., Cooper, E. E., & Hummel J. E. (1997). Recognition-by-geons: 1997's current 
progress and current challenges. Image and Vision Computing, 15, 280-284. 

Kalocsai, P., & Biederman, I. (1997). Biologically inspired recognition model with extension 
fields. Proceedings of the 4th Joint Symposium on Neural Computation, pp. 116-123, 
University of California, San Diego. 

O'Kane, B., Biederman, I., Cooper, E. E., & Nystrom, B. (1997). An account of object 
identification confusions. Journal of Experimental Psychology: Applied, 3, 21-41. 

Fiser, J., Biederman, I., & Cooper, E. E. (1996). To what extent can matching algorithms based 
on direct outputs of spatial filters account for human shape recognition? Spatial Vision, 10, 

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object 
recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human 
Perception and Performance, 21, 1506-1514. 

Fiser, J., & Biederman, I. (1995). Size invariance in visual object priming of gray scale images. 
Perception, 24, 741-748. 

Fiser, J., Biederman, I., & Cooper, E. E. (1995). Test of a two-layer network as a model of 
human entry-level object recognition. Pp. 391-396. J. M. Bower (Ed.) The Neurobiology of 
Computation: Proceedings of the Third Annual Computational Neuroscience Meeting. 
Boston: Kluwer. 

Biederman: Final Progress Report ARO DAAH04-94-G-0065 July 6, 2000 

Biederman, I. (1995). Visual object recognition. In S. M. Kosslyn and D. N. Osherson (Eds.). 
An Invitation to Cognitive Science, 2nd edition, Volume 2, Visual Cognition. MIT Press. 
Chapter 4, pp. 121-165. 

Biederman, I. (1995). Geon theory as an account of shape recognition in mind, brain, and 
network. Cognitive Studies: Bulletin of the Japanese Cognitive Science Society, 2, 46-59. 

Biederman, I. (1995). Some Problems of Visual Shape Recognition to Which the Application of 
Clustering Mathematics Might Yield Some Potential Benefits. In I. J. Cox, P. Hansen, B. 
Julesz (Eds.) Partitioning Data Sets, Pp. 313-329. Providence, R. I.: American 
Mathematical Society. 

CONFERENCE AND SYMPOSIA PRESENTATIONS 

Biederman, I. (2000). Shape recognition in mind and brain. Invited address at a symposium on 
object recognition at the International Congress of Psychology, Stockholm, July. 

Mangini, M. C., Biederman, I., Kosta, A. (2000). Is greater accuracy for gender than person 
discrimination of faces a consequence of class uncertainty? Evidence from normals and a 
prosopagnosic. Poster presented at the Meetings of the Association for Research in Vision 
and Ophthalmology, Ft. Lauderdale, FL., May. Investigative Ophthalmology & Visual 
Science, 41, 225. 

Vessel, E. A., & Biederman, I. (2000). Brightness judgments within minimal part types are 
easier than between part types. Poster presented at the Meetings of the Association for 
Research in Vision and Ophthalmology, Ft. Lauderdale, FL., May. Investigative 
Ophthalmology & Visual Science, 41, 226. 

Biederman, I. (2000). Human face and object recognition in vertebrates (man and macaque). 
Invited paper presented at a Workshop on Recognition of Visual Patterns and Landmarks by 
Insects. Delmenhorst, Germany, March. 

Kosta, A., & Biederman, I. (1999). Does variability in the size of an object’s parts facilitate 
recognition? Paper presented at the 7th Annual Workshop on Object Perception and Memory. 
Los Angeles. Nov. 

Vessel, E. A., Mangini, M. C., & Biederman, I. (1999). Experts vs. novices performing 
subordinate RSVP identification. Paper presented at the 7th Annual Workshop on Object 
Perception and Memory. Los Angeles. Nov. 

Mangini, M. C., & Biederman, I. (1999). Do objects with many parts incur greater attentional 
costs than objects with few parts? Poster presented at the 7th Annual Workshop on Object 
Perception and Memory. Los Angeles. Nov. 

Vogels, R., Biederman, I., & Bar, M. (1999) Sensitivity of macaque temporal neurons to 
variations in object shading. Paper presented at the Meetings of the Society for Neuroscience, 
Miami, FL. Nov. 




Biederman, I. (1999). An Evaluation of "View-Based" vs. Geon Structural Descriptions as 
Alternative Accounts of Visual Object Recognition. Invited paper presented at the 2nd IEEE 
Workshop on Generic Object Recognition, Corfu Greece, September. 

Biederman, I. (1999). Aiding image analysts through RSVP training and displays. Invited 
presentation at a Meeting of Neuroscience Inspired Target Recognition, The Neuroscience 
Institute, La Jolla, CA. September 

Biederman, I. (1999). Recognizing Depth-Rotated Objects: A Review of Recent Research and 
Theory. Invited paper at Workshop on Visual Object Recognition by Humans and Machines, 
Bad Homburg, Germany, May. 

Vogels, R., Biederman, I., & Bar, M. (1999) Sensitivity of macaque temporal neurons to 
differences in view-invariant vs. metric properties of depth-rotated objects. Paper presented 
at the Meetings of the Association for Research in Vision and Ophthalmology, Ft. Lauderdale, 
FL., May. (Investigative Ophthalmology & Visual Science, 40, S776.) 

Vessel, E. A., Subramaniam, S., & Biederman, I. (1999). A change in contrast polarity at an L- 
Junction unbinds its segments. Poster presented at the Meetings of the Association for 
Research in Vision and Ophthalmology, Ft. Lauderdale, FL., May. Investigative 
Ophthalmology & Visual Science, 40, 810. 

Mangini, M. C., Biederman, I., & Williams, E. (1999). The effect of test-context junction 
discontinuities in perceived lightness. Poster presented at the Meetings of the Association for 
Research in Vision and Ophthalmology, Ft. Lauderdale, FL., May. Investigative 
Ophthalmology & Visual Science, 40, 747. 

Biederman, I., & Bar, M. (1998). Cortical localization of subliminal visual priming. Paper 
presented at the Annual Meeting of the Psychonomic Society. Dallas, Nov. 

Peissig, J. J., Young, M. E., Jr., Wasserman, E. A., & Biederman, I. (1998). The pigeon's 
discrimination of single geons. Paper presented at the Annual Meeting of the Psychonomic 
Society. Dallas, Nov. 

Mangini, M. C., Biederman, I., and Williams, E. K. (1998). Perceived lightness as a measure of 
perceptual grouping. Paper presented at the Annual Object Perception and Memory Meeting. 
Dallas, Nov. 

Vessel, E., Subramaniam, S., & Biederman, I. (1998). When does variation in contrast polarity 
affect contour grouping in object recognition? Paper presented at the Annual Object 
Perception and Memory Meeting. Dallas, Nov. 

Sary, G., Kovacs, G., Koteles, K., Benedek, G., Fiser, J., & Biederman, I. (1998). Selectivity 
variations in monkey inferior temporal neurons for intact and contour-deleted line drawings. 
Poster presented at the Meetings of the Society for Neuroscience, Los Angeles, CA, 
November. 

Bar, M., & Biederman, I. (1998). Subliminal visual priming transfers within but not between 
visual quadrants. Poster presented at the Meetings of the Society for Neuroscience, Los 
Angeles, CA, November. 

Biederman, I. (1998). The neurocomputational basis of face and object recognition. Invited 
presentation at the Stockholm Workshop on Computational Vision, Rosenon, August 4-7, 
1998. 

Biederman, I., & Bar, M. (1998). Same-different matching of depth-rotated objects. Paper 
presented at the Meetings of the Association for Research in Vision and Ophthalmology, Ft. 
Lauderdale, FL., May. Investigative Ophthalmology & Visual Science, 39, 1113. 

Bar, M., & Biederman, I. (1998). Evidence that representations mediating subliminal visual 
priming are localized in an intermediate visual area such as V4. Poster presented at the 
Meetings of the Association for Research in Vision and Ophthalmology, Ft. Lauderdale, FL., 
May. Investigative Ophthalmology & Visual Science, 39, 1113. 

Subramaniam, S., Yokosawa, K., & Biederman, I. (1998). Vertex binding and attention to 2-D 
shapes. Poster presented at the Meetings of the Association for Research in Vision and 
Ophthalmology, Ft. Lauderdale, FL., May. Investigative Ophthalmology & Visual Science, 
39, 854. 

Biederman, I. (1998). A neurocomputational basis for the difference in the representation of 
faces and objects. Invited presentation at the Third Annual Cognitive Science Symposium, 
University of California, Riverside. 

Biederman, I. (1998). Why faces and objects are represented differently: A neurocomputational 
analysis. Invited address presented at the Inaugural Conference for the Institut des Sciences 
Cognitives, Lyon, France. April. 

Biederman, I. (1998). Three-dimensional object representation and recognition. Invited 
position paper at a Symposium on Visual Object Recognition: Theory and Experiment 
(VORTEX). Los Angeles, CA, February. 

Bar, M., & Biederman, I. (1997). Subliminal visual priming. Paper presented at the Sixth 
Annual Meeting of the Israel Society for Neurosciences, Eilat, Israel, December. 

Biederman, I., & Bar, M. (1997). What's the fuss about perceiving depth-rotated objects? Paper 
presented at the Meetings of the Psychonomic Society, Philadelphia, PA, November. 

Biederman, I. (1997). Why separate fMRI loci for the recognition of faces and objects? Invited 
presentation at a Symposium on Neural Imaging, University of Michigan, October. 

Biederman, I. (1997). Invited seminar presented at the NATO Advanced Study Institute (ASI) 
on 'Face Recognition: From Theory to Applications', Stirling, Scotland, UK, June 23-July 4. 

Bar, M., & Biederman, I. (1997). Subliminal visual priming. Paper presented at the First 
Conference of the Association for the Scientific Study of Consciousness, Claremont, CA, June. 

Biederman, I., & Subramaniam, S. (1997). Predicting the shape similarity of objects without 
distinguishing viewpoint invariant properties (VIPs) or parts. Investigative Ophthalmology & 
Visual Science, 38, 998. 

Bar, M., & Biederman, I. (1997). The robustness of subliminal visual priming over time and 
intervening trials. Investigative Ophthalmology & Visual Science, 38, 1005. 

Fiser, J., & Biederman, I. (1997). Independence of visual priming to hemisphere, scale, and 
reflection changes. Investigative Ophthalmology & Visual Science, 38, 1005. 

Kalocsai, P., & Biederman, I. (1997). Biologically inspired recognition model with horizontal 
connections and extension fields. Investigative Ophthalmology & Visual Science, 38, 1000. 

Subramaniam, S., & Biederman, I. (1997). Does contrast reversal affect object identification? 
Investigative Ophthalmology & Visual Science, 38, 998. 

Biederman, I. (1997). Invited address to Conference on Vision and Visual Cognition, 
Copenhagen, April 25-27. 

Biederman, I. (1997). Invited discussant at a CIBA Foundation Workshop on Vision, London, 
U. K., February, 14. 

Biederman, I. (1997). Neurocomputational Bases of Face and Object Recognition. Invited 
address presented at a Meeting on Knowledge Based Vision, The Royal Society (London), 
February 12-13. 

Biederman, I. (1996). Invited presentation at U.S. Air Force Conference on New Frontiers in 
Sensor Applications. Albuquerque, New Mexico. November. 

Biederman, I., & Kalocsai, P. (1996). Face but not object representations preserve the original 
Fourier components. Paper presented at the Meetings of the Psychonomic Society, Chicago, 
Nov. 

Kalocsai, P., & Biederman, I. (1996). Addition of horizontal connections and extension fields to 
a low level object recognition model qualitatively improves its performance. Paper presented 
at a Meeting on Object Perception and Memory, Chicago, Nov. 

Bar, M., & Biederman, I. (1996). Subliminal visual priming. Paper presented at a Meeting on 
Object Perception and Memory, Chicago, Nov. 

Fiser, J., Subramaniam, S., & Biederman, I. (1996). Coarse-to-fine tuning on object 
recognition: Size or scale. Paper presented at the European Conference on Visual Perception, 
Strasbourg, France, Sept. 

Biederman, I. (1996). Applied aspects of shape recognition research. Invited address presented 
at the Attention & Performance Conference, Haifa, Israel, July. 

Biederman, I. (1996). Perceiving function. Invited address presented at the Computer Vision 
Symposium. San Francisco, CA, June. 

Biederman, I. (1996). A neural computational account of real-time object and face recognition. 
Invited paper presented to a Symposium on Cognition and Neuroscience. University of 
Michigan, Ann Arbor, Michigan, May. 

Fiser, J., Subramaniam, S., & Biederman, I. (1996). The effect of changing size and spatial 
frequency content of gray-scale object images in RSVP identification tasks. Investigative 
Ophthalmology & Visual Science, 37, 178. 

Kalocsai, P., Biederman, I., Fiser, J., & Fang, P. (1996). Differences between object and face 
recognition in utilizing early visual information. Investigative Ophthalmology & Visual 
Science, 37, 176. 

Bar, M., & Biederman, I. (1996). Is subliminal priming visual? Is it translationally invariant? 
Investigative Ophthalmology & Visual Science, 37, 178. 

Yokosawa, K., Subramaniam, S., & Biederman, I. (1996). Independence of perceptual and 
semantic features in object verification. Investigative Ophthalmology & Visual Science, 37, 
178. 

O'Kane, B. L., Biederman, I., Cooper, E. E., & Kalocsai, P. (1996). Modeling parameters for 
target identification: Spatial filters vs. critical features. Paper presented at the IRIS 
conference, Monterey, CA. 

O'Kane, B. L., Biederman, I., Cooper, E. E., & Kalocsai, P. (1996). Spatial filter and geon 
models as a biological framework for the identification of thermal signatures. Paper 
presented at the Meetings of the Army Science Board, Newport News, VA. 

Biederman, I., & Bar, M. (1995). One-Shot Viewpoint Invariance with Nonsense Objects. 
Paper presented at the Annual Meeting of the Psychonomic Society, 1995, Los Angeles, 
November. 

Kirkpatrick-Steger, K., Wasserman, E. A., & Biederman, I. (1995). Effects of deletion, 
movement, and scrambling of object components on picture perception in pigeons. Paper 
presented at the Annual Meeting of the Psychonomic Society, Los Angeles, November. 

Fiser, J., & Biederman, I. (1995). Do spatial frequency and orientation information contribute 
similarly to visual object priming? Paper presented at the Third Annual Workshop on Object 
Perception and Memory. Los Angeles, Nov. 

Bar, M., & Biederman, I. (1995). Immediate use of viewpoint invariant information for 
matching depth-rotated objects. Paper presented at the Third Annual Workshop on Object 
Perception and Memory. Los Angeles, Nov. 

Biederman, I. (1995). Recognition of Faces and Objects: Speculations on a General Theory of 
Shape Recognition. Invited presentation to the Workshop on Face and Object Recognition, 
Cardiff, Wales, Oct. 

Biederman, I. (1995). A neural-computational theory of perceptual and cognitive pleasure. 
Invited address to the Welsh Branch of the British Psychological Society, Cardiff, Wales. 
Oct. 

Biederman, I. (1995). Binding and object recognition. Invited paper presented at a Symposium 
on Phenomena and Architectures of Cognitive Dynamics. Leipzig, Germany, June. 

Biederman, I., & Kalocsai, P. (1995). The psychophysics of face recognition. Invited paper at 
the International Workshop on Automatic Face- and Gesture-Recognition. Zurich, 
Switzerland, June. 

Subramaniam, S., Biederman, I., Kalocsai, P., & Madigan, S. R. (1995). Accurate 
identification, but chance forced-choice recognition for RSVP pictures. Investigative 
Ophthalmology & Visual Science, 36, 377. 

Fiser, J., & Biederman, I. (1995). Priming with complementary gray-scale images in the spatial- 
frequency and orientation domains. Investigative Ophthalmology & Visual Science, 36, 475. 

Biederman: Final Progress Report ARO DAAH04-94-G-0065 July 6, 2000 


Cooper, E. E., Subramaniam, S., & Biederman, I. (1995). Recognizing objects with an irregular 
part. Investigative Ophthalmology & Visual Science, 36, 473. 

Kalocsai, P., & Biederman, I. (1995). Selective attention among presumed classifiers in the 
human face recognition system. Investigative Ophthalmology & Visual Science, 36, 374. 

Biederman, I., Gerhardstein, P. C., & Bar, M. (1995). An inadvertent experiment fails to 
confirm the employment of viewpoint dependent mechanisms in human object recognition. 
Investigative Ophthalmology & Visual Science, 36, 184. 

Biederman, I. (1995). From image edges to geons to viewpoint-invariant object representations. 
Invited address (featured speaker) presented to the Vision Society of Japan, Tokyo, January. 

Biederman, I. (1995). Invited panelist. Discussion of 3D object representation in the brain. 
ATR Symposium on Face and Object Recognition '95, Kansai, Japan, January. 

Biederman, I. (1995). Recognition of faces and objects: implications for a general theory of 
shape recognition. Invited presentation at the ATR Symposium on Face and Object 
Recognition '95, Kansai, Japan, January. 

Professional recognition based on ARO supported work: 

Current Editorial Boards: Psychological Review, Visual Cognition, Journal of Experimental 
Psychology: Human Perception and Performance; and British Journal of Psychology 

Elected to the Society of Experimental Psychologists (an honor society). 

Invited to be a Fellow at the Center for Advanced Study in the Behavioral Sciences, Stanford 
University. 

Interviewed by Anne Eisenberg, University of Iowa, as one of the 15 most influential cognitive 
scientists in the world as determined by citation counts and peer ratings. 

USC Associates Award for Creativity in Research ($5,000). 

Named to give the 2001 Broadbent Lecture at the Meetings of the European Society of Cognitive 
Psychology to be held in Edinburgh, Scotland. 

1987 Psychological Review article, “Recognition-by-Components: A Theory of Human Image 
Understanding” deemed a “Classic” in a 1999 poll of visual perception scientists conducted by 
Professor Steven Yantis, Johns Hopkins University. 

Invited to Present the Following Featured or Keynote Addresses at Scientific Meetings 

Invited to present the Donald E. Broadbent Lecture at the European Conference on Cognitive 
Psychology, Edinburgh, Scotland (July, 2001). 

Invited Keynote Speaker, 4th Annual Southern California Joint Symposium on Neural 
Computation. Los Angeles, CA, May, 1997. 

Invited address to the Royal Society of London, February, 1997. 

IEEE Computer Vision and Pattern Recognition Conference, San Francisco, CA June, 1996 
(Keynote Speaker on Workshop on Function, Formation, and Facilitation.) 

Attention & Performance Conference, Haifa, Israel, July 1996. (Featured speaker.) 

Meetings of the Vision Society of Japan, January, 1995. (Featured Speaker) 

INVITED COLLOQUIA 

Weizmann Institute, Rehovot, Israel; Claremont College; INSERM, Cerveau et Vision, Lyon, 
France; INSERM, Strasbourg, France; University of Southampton, England; UCLA; INSERM, 
Strasbourg, France; University of Glasgow, Scotland; University of St. Andrews, Scotland; 
Catholic University of Leuven, Belgium; Birmingham University, England; University of 
California, Irvine; University of Michigan; Rutgers University (Nov. '97); Technische 
Universitaet Berlin (Dec. '97); Szeged University (Hungary) (Dec. '97); University of Louisville 
(March, '98); Johns Hopkins University (March, '98); Centre de Recherche Cerveau et Cognition, 
CNRS, Toulouse (April, '98); University of California, San Diego (Computer Science, March, 
'99); UCLA (Cognitive Science, November, '99); University of British Columbia (Jan., '00); 
Simon Fraser University (Jan., '00); Claremont College (Jan., '00); Max Planck Institute, Munich 
(Feb., '00); Medical University of Munich (Neurology) (Feb., '00); Department of Physiology, 
Catholic University of Leuven, Belgium (March, '00); CNRS Marseille, France (April, '00). 


SCIENTIFIC PERSONNEL SUPPORTED BY THIS PROJECT AND DEGREES AWARDED 
DURING THIS REPORTING PERIOD: 

Jozsef Fiser, USC Graduate Student, Computer Science, Ph. D. 

Peter Kalocsai, USC Graduate Student, Psychology, Ph. D. 

Suresh Subramaniam, USC Graduate Student, Psychology, Ph. D. 

Moshe Bar, USC Graduate Student, Psychology, Ph. D. 

Michael C. Mangini, USC Graduate Student 

Edward A. Vessel, USC Graduate Student 

Nancy Wang, USC Undergraduate Student in Psychobiology, B. A. 

Trang Hong, USC Undergraduate Student in Psychology, B. A. 

Kathy Kreuzer, USC Undergraduate Student in Political Science, B.A. 

Viet Nguyen, USC Undergraduate Student in Classics 

Ali Narayan, USC Undergraduate Student in Psychology 

Henry Nguyen, USC Undergraduate Student in Engineering 

9. INVENTIONS: None. 


ABSTRACTS OF SELECTED PAPERS 

O'Kane, B., Biederman, I., Cooper, E. E., & Nystrom, B. (1997). An account of object 
identification confusions. Journal of Experimental Psychology: Applied, 3, 21-41. 

In two experiments, trained military observers identified vehicles in infrared (thermal) imagery 
that varied in distance, signal-to-noise ratio, and orientation. A measure of shape similarity was 
derived from a contingency tree which allowed prediction of the confusion rates between any 
two vehicles based on the number of detectable, distinguishing parts. The mean confusion rates 
between pairs of vehicles were strongly correlated with the nodal distance between these vehicles 
in the similarity trees, even though the similarity trees had been constructed without knowledge 
of the confusion rates. Such trees offer the possibility for substantial improvements in the 
modeling of human object identification and, when incorporated into training programs, offer a 
high potential for reducing the likelihood of identification errors. 
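
The nodal-distance idea in this abstract can be illustrated with a minimal sketch. The abstract does not define nodal distance, so the sketch assumes it is the number of edges on the path between two vehicles in the tree. The tree, the node names (`has_turret`, `vehicle_A`, and so on), and the helper functions are hypothetical and are not taken from the paper.

```python
def path_to_root(parent, node):
    """Return the list of nodes from `node` up to the root of the tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def nodal_distance(parent, a, b):
    """Number of edges on the path between nodes a and b (assumed definition)."""
    depth_from_a = {n: i for i, n in enumerate(path_to_root(parent, a))}
    for steps_from_b, n in enumerate(path_to_root(parent, b)):
        if n in depth_from_a:  # first shared ancestor
            return depth_from_a[n] + steps_from_b
    raise ValueError("nodes are not in the same tree")


# Hypothetical contingency tree: each internal node stands for a test on a
# detectable, distinguishing part; leaves are vehicles.
parent = {
    "has_turret": "root",
    "no_turret": "root",
    "vehicle_A": "has_turret",
    "vehicle_B": "has_turret",
    "vehicle_C": "no_turret",
}

print(nodal_distance(parent, "vehicle_A", "vehicle_B"))  # share a parent node
print(nodal_distance(parent, "vehicle_A", "vehicle_C"))  # meet only at the root
```

Under this assumed definition, vehicles that share more distinguishing-part tests sit closer together in the tree, which is the quantity the abstract reports as correlated with confusion rates.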

Biederman, I., & Subramaniam, S. (1997). Predicting the shape similarity of objects without 
distinguishing viewpoint invariant properties (VIPs) or parts. Investigative Ophthalmology & 
Visual Science, 38, 998. 

Purpose. The similarity between complex shapes such as A&B, not distinguished by NAPs or 
part differences, would seem to be ineffable. Would a similarity measure based on a lattice of 
(wavelet-like) columns of Gabor filters at different scales and orientations, presumed to be a 
simplified model of early filtering (Lades et al., 1993), be correlated with the actual difficulty of 
distinguishing among such shapes? Methods. The similarity of each of the 81 Shepard & 
Cermak (1973) toroidal "free-form" stimuli was evaluated by the Lades et al. model. The 
similarity value, percentage of maximum similarity, is a function of the sum of the differences in 
the activation values of the corresponding filters. The values of the four shapes relative to A are 
shown. Subjects performed same-different judgments of a pair of sequentially presented shapes, 
with S1 = 250 msec, Mask (= ISI) = 125 msec, S2 = 200 msec, Mask = 125 msec. On 50% of the trials 
S1 = S2. Similarity on different trials ranged from 68 to 95. Results. At high levels of 
similarity, above 83, RTs and error rates on negative trials were almost perfectly correlated with 
the similarity values, rs = .96 and .95, respectively. Below 83, RTs were moderately related 
to similarity (r = .65), and errors were near floor. Conclusion. A similarity measure roughly 
characteristic of early cortical stage (V1 or V2) filtering provides an excellent measure of the 
similarity of highly similar complex shapes. Slightly less similar shapes can activate different 
structural descriptions, e.g., "straight vs. curved big lobe," for a pair of objects, and render 
similarity less related to differences of early spatial filter activations. 
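
A minimal sketch of a Gabor-jet similarity measure of the kind described in this abstract is given below. It is not the Lades et al. (1993) implementation. The kernel parameters, lattice spacing, function names, and in particular the normalization that converts summed activation differences into a percentage are assumptions made for illustration only.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
    return envelope * carrier


def jet(image, row, col, wavelengths=(4, 8, 16), n_orient=4, size=21):
    """Response magnitudes of all filters centered on one lattice position."""
    half = size // 2
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    responses = []
    for w in wavelengths:              # scales (assumed values)
        for k in range(n_orient):      # evenly spaced orientations (assumed)
            kern = gabor_kernel(size, w, np.pi * k / n_orient, sigma=0.5 * w)
            responses.append(np.abs(np.sum(patch * kern)))
    return np.array(responses)


def jet_lattice(image, step=8, size=21):
    """Concatenate the jets over a regular lattice of positions."""
    half = size // 2
    rows = range(half, image.shape[0] - half, step)
    cols = range(half, image.shape[1] - half, step)
    return np.concatenate([jet(image, r, c, size=size) for r in rows for c in cols])


def percent_similarity(img_a, img_b):
    """Similarity as a percentage of maximum (assumed normalization)."""
    a, b = jet_lattice(img_a), jet_lattice(img_b)
    denom = np.sum(a) + np.sum(b)
    if denom == 0:
        return 100.0
    # Summed differences of corresponding filter activations, rescaled to 0-100.
    return 100.0 * (1.0 - np.sum(np.abs(a - b)) / denom)
```

With this normalization, two identical images score 100 and the score falls as the activations of corresponding filters diverge. The values are not comparable to the 68 to 95 range reported in the abstract, which depends on the original model's parameters.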



Biederman, I., & Kalocsai, P. (1997). Neurocomputational bases of object and face recognition. 
Philosophical Transactions of the Royal Society London: Biological Sciences, 352, 1203-1219. 

Abstract. A number of behavioral phenomena distinguish the recognition of faces and objects, 
even when members of the set of objects are highly similar. Because faces have the same parts 
in approximately the same relations, individuation of faces typically requires specification of the 
metric variation in a holistic and integral representation of the facial surface. The direct mapping 
of a hypercolumn-like pattern of activation onto a representation layer that preserves relative 
spatial filter values in a 2D coordinate space, as proposed by C. von der Malsburg and his 
associates (Lades et al., 1993; Wiskott, et al., 1997), may account for many of the phenomena 
associated with face recognition. An additional refinement, in which each column of filters 
(termed "a jet") is centered on a particular facial feature (or fiducial point), allows selectivity of 
the input into the holistic representation to avoid incorporation of occluding or nearby surfaces. 
The initial hypercolumn representation also characterizes the first stage of object perception, but 
the image variation for objects at a given location in a 2D coordinate space may be too great to 
yield sufficient predictability directly from the output of spatial kernels. Consequently, objects 
can be represented by a structural description specifying qualitative (typically, nonaccidental) 
characterizations of an object's parts, the attributes of the parts, and the relations among the parts, 
largely based on orientation and depth discontinuities (e.g., Hummel & Biederman, 1992). A 
series of experiments on the name priming or physical matching of complementary images (in 
the Fourier domain) of objects and faces documents that whereas face recognition is strongly 
dependent on the original spatial filter values, object recognition evidences strong invariance to 
these values, even when distinguishing among objects that are as similar as faces. 

Biederman, I., Subramaniam, S., Bar, M., Kalocsai, P., & Fiser, J. (1998). Subordinate-level 
object classification reexamined. Psychological Research, 62, 131-153. 

Abstract. The classification of a table as round rather than square, a car as a Mazda rather than a 
Ford, a drill bit as 3/8 inch rather than 1/4 inch, and a face as Tom, have all been regarded as a 
single process termed "subordinate classification." Despite the common label, the considerable 
heterogeneity of the perceptual processing required to achieve such classifications requires, 
minimally, a more detailed taxonomy. Perceptual information relevant to subordinate level 
shape classifications can be presumed to vary on continua of: a) the type of distinctive 
information that is present, nonaccidental or metric, b) the size of the relevant contours or 
surfaces, and c) the similarity of the to-be-discriminated features, e.g., whether a straight contour 
has to be distinguished from a contour of low curvature vs. high curvature. We consider three, 
relatively pure, cases. Case 1 subordinates may be distinguished by a representation, a geon 
structural description (GSD), specifying a nonaccidental characterization of an object's large 
parts and the relations among these parts, such as a round table vs. a square table. Case 2 
subordinates are also distinguished by GSDs, except that the distinctive GSDs are present at a 
small scale in a complex object so the location and mapping of the GSDs are contingent on an 
initial basic-level classification, as when we use the logo to distinguish various makes of cars. 
Expertise for Cases 1 and 2 can be easily achieved through specification, often verbal, of the 
GSDs. Case 3 subordinates, which have furnished much of the grist for theorizing with "view- 
based" template models, require fine metric discriminations. Cases 1 and 2 account for the 
overwhelming majority of shape-based basic- and subordinate-level object classifications that 
people can and do make in their everyday lives. These classifications are typically made 
quickly, accurately, and with only modest costs of viewpoint changes. Whereas the activation of 
an array of multiscale, multiorientation filters, presumed to be at the initial stage of all shape 
processing, may suffice for determining the similarity of the representations mediating 
recognition among Case 3 subordinate stimuli (and faces), Cases 1 and 2 require that the output 
of these filters be mapped to classifiers that make explicit the nonaccidental properties, parts, and 
relations specified by the GSDs. 

Bar, M., & Biederman, I. (1999). One-shot viewpoint invariance in matching novel objects. 
Vision Research, 39, 2885-2899. 

Humans show an extraordinary competence at recognizing objects from arbitrary orientations in 
depth. According to one class of theories, this competence is based on previously having learned 
different templates, expressing the metric properties (MPs) at the different orientations. An 
alternative class of theories assumes that nonaccidental properties (NAPs) can be exploited so 
that even novel objects can be recognized under depth rotation. Same-Different judgments of a 
sequential pair of novel rotated objects, differing in a MP or a NAP (when different), viewed 
only once by each subject, revealed complete depth invariance when objects differed in a NAP. 
Rotation dramatically reduced the detectability of MP differences to a level well below that 
expected by chance. NAPs offer a striking advantage over MPs for object classification and are 
therefore more likely to play a central role in the representation of objects. 

Subramaniam, S., Biederman, I., & Madigan, S. A. (2000). Accurate identification but no 
priming and chance recognition memory for pictures in RSVP sequences. Visual Cognition, 
7, 511-535. 

Abstract. In 1969, Potter and Levy reported that recognition memory of accurately perceived 
RSVP pictures was extremely low, an effect that they attributed to disruption of memory 
consolidation. Here we report that the repetition of an RSVP picture (72-126 msec/picture) up to 
31 times prior to when it became a target had no effect on identification accuracy. At these rates, 
forced choice recognition memory was at chance. Single presentations of the pictures outside of 
the RSVP sequences readily resulted in substantial priming of their identification within the 
sequences. We offer a neural interpretation of Potter and Levy’s explanation, as well as 
contemporary two-stage accounts of RSVP memory and attentional phenomena, based on the 
recent finding (Tovee & Rolls, 1995) that most of the information in inferior temporal cells is 
conveyed in the first 50 msec of firing but the cells continue their activity for an additional 350 
msec. The additional activity, by our account, is required for memory and it is this activity that 
may be disrupted by attention to the next image during RSVP presentations. The critical factor 
for priming, if not memory in general, may be attention to the stimulus for a few hundred msec 
beyond that which is required for its identification. Single trial presentations thus manifest 
robust memory and priming effects—even when the stimulus cannot be identified—while RSVP 
conditions in which the stimulus can be identified result in poor memory. 

Bar, M., & Biederman, I. (1998). Subliminal visual priming. Psychological Science, 9, 464- 
469. 

Masked pictures of objects were flashed so briefly that only 13.5% of them could be named. 
Forced-choice accuracy for the unidentified objects was at chance. When shown again, about 15 
minutes and 20 intervening trials later, without any indication of possible repetitions, 
identification accuracy increased to 34.5%. The priming was completely visual, rather than 
semantic or verbal, as there was no priming of same name-different shaped images. This is the 
first demonstration of facilitatory visual recognition priming by unidentified pictures when the 
subject could not anticipate if, when, or where the previously unidentified picture was to be 
shown again. A change in the position of the object reduced but did not eliminate the priming, 
allowing a speculation that the locus of subliminal visual priming is at an intermediate stage in 
the ventral cortical pathway for shape recognition. 

Bar, M., & Biederman, I. (1999). Localizing the cortical region mediating visual awareness of 
object identity. Proceedings of the National Academy of Sciences, 96, 1790-1793. 

Presentations of pictures that are too brief to be recognized, or even guessed above chance on a 
forced-choice test, can nonetheless facilitate the recognition of the same pictures many trials 
later. This subliminal visual priming was compared for images translated 4.8° either between or 
within quadrants of the visual field. Priming was evident only for images that remained within 
the same quadrant on priming and test trials. Consequently, subliminal visual priming is likely 
mediated by cortical areas in which a substantial portion of the cells have receptive fields (RFs) 
large enough to respond to both presentations of a stimulus translated almost 5° yet where the 
RFs are confined to a single quadrant, viz., the human homologue of macaque V4 or TEO (the 
posterior part of the inferior temporal cortex). Awareness of object identity might therefore be 
associated exclusively with activity at or beyond the anterior part of the inferotemporal cortex. 

Vogels, R., Biederman, I., Bar, M., & Lorincz, A. (2000). Inferior temporal neurons show 
greater sensitivity to nonaccidental than metric differences. Journal of Cognitive 
Neuroscience, In press. 

It has long been known that macaque inferior temporal (IT) neurons tend to fire more strongly to 
some shapes than to others and that other neurons can show markedly different shape 
preferences. Beyond the discovery that these preferences can be elicited by features of moderate 
complexity, no general principle of (non-face) object recognition had emerged by which this 
enormous variation in selectivity could be understood. Computational as well as psychophysical 
work suggests that one such principle is the difference between view-invariant, non-accidental 
(NAP) and view-dependent, metric shape properties (MPs). We measured the responses of 
single IT neurons to objects differing in either a NAP or a MP of a single part, shown at two 
orientations in depth. The cells were more sensitive to NAP than MP changes, even when the 
image variation produced by the former was smaller than that produced by the latter. Multidimensional scaling 
of the neural responses indicated a NAP/MP dimension, independent of an orientation 
dimension. The present results thus demonstrate that a significant portion of IT’s neural code 
represents differences in NAPs rather than MPs. This code may enable a) immediate recognition 
of novel objects at new views, and b) "object constancy," the subjective phenomenon that objects 
undergoing depth rotation do not appear to change their shape despite dramatic changes in the 
retinal image produced by such rotations. 